💾 Prompt Caching
Context Reuse, KV Cache, Inference Optimization, Token Efficiency
Scoured 20712 posts in 243.5 ms
1M token context: The good, the bad and the ugly (2025)
micron.com · 2d · Discuss: Hacker News
🏗️ LLM Infrastructure
SalesforceAIResearch/promptomatix: An Automatic Prompt Optimization Framework for Large Language Models
github.com · 2d
🪄 Prompt Engineering
ProphetKV: User-Query-Driven Selective Recomputation for Efficient KV Cache Reuse in Retrieval-Augmented Generation
arxiv.org · 3d
⚡ Vectorized Execution
Achieving Ultra-Fast AI Chat Widgets
cjroth.com · 6h · Discuss: Hacker News
🪄 Prompt Engineering
What I wish I knew before building a vibe coding platform
imagine.dev · 1d · Discuss: Hacker News
🔄 Incremental Computation
When Language Models Get Stuck: The Mechanics of Repetition Loops
pub.towardsai.net · 1d
🔤 Tokenization
Performance Tip of the Week #79: Make at most one tradeoff at a time
abseil.io · 5h
⚙️ Mechanical Sympathy
Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory
news.ycombinator.com · 3m · Discuss: Hacker News
🔎 Tantivy
Speeding Up HTML Generation by 2000%
bobrubbens.nl · 2d
🛠️ Build Optimization
How we cut Vertex AI latency by 35% with GKE Inference Gateway
cloud.google.com · 1d
🧠 Inference Serving
Thread by @ClaudeCodeLog on Thread Reader App
threadreaderapp.com · 10h
🔌 Claude Plugins
Profiling Go programs with pprof
jvns.ca · 4h
🔬 Rust Profiling
Optimized LLM Inference Engines
rishirajacharya.com · 3d
🏗️ LLM Infrastructure
A Guide to Effective Prompt Engineering
blog.bytebytego.com · 3d
🪄 Prompt Engineering
When Clever Hardware Hacks Bite Back: A Password Keeper Device Autopsy
hackaday.com · 4h
🔓 Hacking
NixOS: Systemd unit-linking script rewrite for 60x speedups by Profpatsch · Pull Request #479442
github.com · 11h · Discuss: Hacker News
💫 IO_uring
KiloClaw: Hosted OpenClaw in 60 Seconds
blog.kilo.ai · 12h
🔍 Quickwit
How I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS
mohammedeabdelaziz.github.io · 12h · Discuss: Hacker News
🏗️ LLM Infrastructure
Oatmeal - Constraint propagation for fun
eli.li · 1h
🧮 SMT Solvers
How Meta turned the Linux Kernel into a planet-scale Load Balancer. Part I
softwarefrontier.substack.com · 11h · Discuss: Substack
🔗 High-Speed Networking